4.  INTRODUCTION

The  purpose  of  this  project  is  to  develop  a  database  of  digital  mammograms  that  can  be 
used  by  researchers  who  (1)  are  trying  to  determine  the  image  quality  requirements  of  detectors  for 
digital  mammography;  (2)  are  developing  image  processing  techniques  to  optimize  the  displayed 
digital  mammogram;  (3)  are  developing  computerized  methods  for  analyzing  mammograms;  (4)  are 
studying  the  effects  of  image  compression  methods  on  image  quality;  (5)  are  developing  methods 
for  remote  transmission  of  mammograms;  and  (6)  are  studying  the  relationship  between  image 
quality  and  diagnostic  accuracy.  This  database  also  could  be  used  as  a  resource  for  teaching 
radiology  residents  and  for  testing  the  performance  levels  of  mammographers. 

The  specific  aims  of  this  proposal  are: 

1.  Collect  and  digitize  200  cases  in  each  of  5  categories  of  mammograms,  namely  those  exhibiting:  (i)
clustered  microcalcifications,  (ii)  masses,  (iii)  architectural  distortions,  (iv)  asymmetric  densities,  and
(v)  no  lesions  (i.e.,  normals).

2.  Make  these  cases  available  to  other  researchers  either  over  a  computer  network  (Internet)  or  by
sending  images  on  computer  tape  or  CD.  The  database  will  be  distributed  as  widely  as  possible  so 
that  comparisons  of  different  computerized  analysis  techniques  can  be  standardized. 

5.  BODY 

This  research  is  being  funded  as  an  infrastructure  award  and  as  such,  it  does  not  represent  a 
research  project  per  se.  That  is,  there  is  no  hypothesis  that  we  are  trying  to  prove.  Therefore,  this 
report  is  structured  slightly  differently  from  a  normal  scientific  research  report:  it  is  heavy  on  the
method  and  light  on  actual  results.  In  this  project,  the  procedure  is  the  most  important  component, 
which  is  applied  continuously  in  a  straightforward  manner  to  achieve  the  goal  of  creating  the 
database  of  mammograms. 


Task  1:  Collect  and  digitize  mammograms.  (See  Figure  1.) 

We  now  have  630  cases  digitized  (see  Table  I,  at  the  end  of  the  report).  Most  of  these  cases 
still  need  to  have  the  truth  marked  accurately.  Currently,  because  of  the  high  volume  of  clinical  work  in
the  radiology  department,  it  has  been  difficult  to  get  a  radiologist  to  mark  the  truth. 

It  is  important  that  this  database  be  used  in  such  a  way  that  comparisons  between  different
algorithms  can  be  made.  This  is  the  main  motivation  for  creating  such  a  database.  To  allow  for 
valid  comparisons  to  be  made,  two  things  are  needed:  (i)  the  exact  same  cases  need  to  be  used  to 
measure  performance,  and  (ii)  the  same  criteria  for  scoring  the  results  need  to  be  used.  To  this  end, 
we  have  divided  the  database  into  testing  and  training  cases.  Furthermore,  we  are  preparing 
recommendations  for  scoring  the  results  based  on  some  preliminary  results  in  our  lab  [1].  This
information  will  be  released  with  the  database.  No  other  database  offers  such  instructions  for  its 
use  and  thus  comparisons  of  different  techniques  are  still  difficult  to  do.  We  are  in  the  process  of 
specifying  an  objective  scoring  method  based  on  radiologists’  input.  Once  we  have  the  scoring 
method  developed  and  the  truth  marked,  we  will  begin  to  distribute  the  database  widely.  We  expect
that  this  can  be  done  in  the  late  winter  or  spring. 
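
One way to guarantee that every group evaluates on exactly the same cases is to make the training/testing assignment a fixed, reproducible function of the case identifier. The following is a minimal sketch under stated assumptions, not the partitioning actually used for this database: the identifiers, the hash-based rule, the function names, and the 50% test fraction are all hypothetical and chosen only for illustration.

```python
import hashlib


def assign_partition(case_id: str, test_fraction: float = 0.5) -> str:
    """Deterministically map a case identifier to 'train' or 'test'.

    Hypothetical rule: hash the identifier and compare the scaled hash
    value with the requested test fraction.
    """
    digest = hashlib.sha256(case_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64) and scale to [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if value < test_fraction else "train"


def split_cases(case_ids, test_fraction: float = 0.5):
    """Return a dict with the sorted 'train' and 'test' case identifiers."""
    partitions = {"train": [], "test": []}
    # Sorting makes the output independent of the input order.
    for case_id in sorted(case_ids):
        partitions[assign_partition(case_id, test_fraction)].append(case_id)
    return partitions
```

Because the assignment depends only on the identifier, adding new cases later never moves an existing case from one partition to the other. A split that is balanced within each lesion category would need an additional stratification step, which this sketch omits.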

Figure  1.  A  flowchart  of  the  steps  required  to  collect,  digitize,  archive,  and  distribute
the  mammographic  database.  The  'Full  Image'  is  the  whole  digitized  mammogram  at  full 
resolution.  The  'Reduced  Image'  is  a  minified  version  (reduced  resolution)  of  the  full 
image.  The  'ROI  Image'  is  a  portion  of  the  full  image  at  full  resolution. 
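
The relationship between the three image types in the caption can be illustrated with a short sketch. It assumes an image stored as a list of rows of integer pixel values and uses block averaging for minification; the report does not state how the reduced image is actually computed, so the averaging rule, the function names, and the parameters are hypothetical.

```python
def reduce_image(full, factor):
    """Minify an image by averaging non-overlapping factor x factor blocks.

    Rows or columns left over when the size is not a multiple of
    `factor` are dropped.
    """
    rows = len(full) // factor
    cols = len(full[0]) // factor
    reduced = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                full[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            # Integer mean keeps the pixel type unchanged.
            out_row.append(sum(block) // len(block))
        reduced.append(out_row)
    return reduced


def extract_roi(full, top, left, height, width):
    """Return a full-resolution sub-region (ROI) of the image."""
    return [row[left:left + width] for row in full[top:top + height]]
```

The reduced image trades resolution for size over the whole field of view, whereas the ROI keeps full resolution but covers only part of the mammogram.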

Task  2:  Establish  protocol  for  transmitting  database

We  originally  considered  the  ACR/NEMA  (DICOM)  image  format  for  our  database. 
However,  when  we  began  our  work,  the  ACR/NEMA  format  did  not  have  a  module  for 
mammography,  and  it  would  have  been  an  extensive  project  to  develop  one  at  that  time.  Recently  a 
digital  mammography  module  has  been  approved.  We  will  consider  whether  it  is  appropriate  to  use 
this  file  format,  since  many  users  of  our  database  may  not  have  a  DICOM  reader. 


Task  3:  Maintain  database  and  distribute  cases 


Maintenance  and  distribution  of  the  database  are  currently  minimal.
These  tasks  will  become  important  shortly  as  cases  go  “on-line”.  Cases  are  being  archived  on
4-mm  tape  and  DVD.


6.  KEY  RESEARCH  ACCOMPLISHMENTS 


•  Collection  of  630  mammographic  cases 

7.  REPORTABLE  OUTCOMES 


Presentations  and  Manuscripts: 

1.  Nishikawa  RM,  Wolverton  DE,  Schmidt  RE,  Johnson  RE,  Pisano  ED,  Hemminger  BM:  A
common  database  of  mammograms  for  research  in  digital  mammography.  U.S.  Army  Medical 
Research  and  Materiel  Command  Breast  Cancer  Research  Program:  An  Era  of  Hope, 
November,  1997,  Washington,  DC. 

2.  Nishikawa  RM,  Wolverton  DE,  Schmidt  RA,  Pisano  ED,  Hemminger  BM,  Moody  J:  A 
common  database  of  mammograms  for  research  in  digital  mammography.  In:  Doi  K,  Giger 
ML,  Nishikawa  RM,  and  Schmidt  RA  (eds.),  Digital  Mammography  ‘96.  (Amsterdam: 
Elsevier  Science)  435-438, 1996. 

3.  Nishikawa  RM:  Mammographic  databases.  Breast  Disease  10:  137-150,  1998.


8.  CONCLUSIONS 


The  development  of  a  common  database  of  mammograms  for  digital  mammography 
research  is  underway.  We  are  currently  establishing  an  objective  scoring  method  based  on 
radiologists’  input.  We  will  distribute  cases  and  the  scoring  method  in  order  to  ensure  that
meaningful  comparisons  between  different  techniques  can  be  made.  Such  comparisons  are 
currently  not  possible  or  are  problematic  with  any  existing  database. 

Table  I.  Breakdown  of  cases  in  the  database  as  of  October  1/99.

Type of Lesion              Pathology     # of Cases

Mass                        Malignant     116
Mass                        Benign         75
Microcalcifications         Malignant     115
Microcalcifications         Benign         87
Asymmetric Density          Malignant      30
Asymmetric Density          Benign          4
Architectural Distortion    Malignant      32
Architectural Distortion    Benign          3
Normal                                    168

Total                                     630
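
The counts in Table I can be compared with the goal of 200 cases per category stated in the first specific aim. The sketch below is only an arithmetic check; it assumes that the lesion types in Table I correspond one-to-one to the five categories in that aim, and the variable and function names are hypothetical.

```python
TARGET_PER_CATEGORY = 200  # goal stated in specific aim 1

# Rows of Table I: (type of lesion, pathology, number of cases).
TABLE_I = [
    ("Mass", "Malignant", 116),
    ("Mass", "Benign", 75),
    ("Microcalcifications", "Malignant", 115),
    ("Microcalcifications", "Benign", 87),
    ("Asymmetric Density", "Malignant", 30),
    ("Asymmetric Density", "Benign", 4),
    ("Architectural Distortion", "Malignant", 32),
    ("Architectural Distortion", "Benign", 3),
    ("Normal", None, 168),
]


def cases_per_category(rows):
    """Sum malignant and benign counts for each type of lesion."""
    totals = {}
    for lesion_type, _pathology, count in rows:
        totals[lesion_type] = totals.get(lesion_type, 0) + count
    return totals


def remaining_to_target(totals, target=TARGET_PER_CATEGORY):
    """Number of cases still needed per category (never negative)."""
    return {name: max(0, target - n) for name, n in totals.items()}
```

Under that assumption, the per-category totals are 191 masses, 202 microcalcification cases, 34 asymmetric densities, 35 architectural distortions, and 168 normals, which sum to the 630 cases reported. The asymmetric density and architectural distortion categories are furthest from the goal.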

